Predictive Modeling of Weather Station Data:

Linear Regression vs. Graph Neural Network

Authors

Colby Fenters & Lilith Holland (Advisor: Dr. Cohen)

Published

July 31, 2025

Slides: slides.html

Introduction

Accurate weather prediction is a crucial task with widespread implications across agriculture, transportation, disaster preparedness, and energy management. Traditional forecasting methods often rely on statistical models or physics-based simulations; however, with the advancement of graph neural networks (GNNs), we believe a more modern deep learning approach holds real potential (Lam et al. 2023; Keisler 2022).

In this project, we explore the predictive power of a traditional linear regression model and a GNN on real-world weather station data (Herzmann 2023). Our aim is to evaluate whether the GNN’s ability to incorporate spatial relationships between stations offers a measurable advantage over more conventional techniques.

The dataset consists of multiple weather stations located within the same geographic region. Each station collects meteorological variables over time and can be represented as a node within a broader spatial network. For the linear model baseline, a single model will be trained using all stations’ data aggregated per feature for each time step.

For the GNN, the model will be trained on the entire network of stations, where each node corresponds to a station and edges represent spatial relationships. The graph is encoded via a dense adjacency matrix, excluding self-connections. The GNN aims to leverage the inherent spatial structure of the data, potentially capturing regional weather patterns and inter-station dependencies that are invisible to traditional models (Li et al. 2018).

Our evaluation focuses on forecasting performance over the last six months of the dataset. We assess how well each modeling approach predicts key weather variables and investigate the conditions under which one model may outperform the other.

Methods

1. Cleaning Process

The raw weather station data underwent a multistage cleaning and preprocessing procedure designed to ensure temporal consistency, handle missing values, and prepare the data for both linear and GNN-based models. The steps are as follows:

  • Temporal Alignment: Missing time steps were filled for all stations to ensure uniform time coverage across the selected date range.

  • Temporal Compression: Data was downsampled from 1-hour intervals to 6-hour intervals to reduce noise and improve model efficiency.

  • Feature Pruning: Features missing more than 10% of their values in over 6 out of 8 stations were removed.

  • Station Filtering: Stations missing more than two years of valid data were excluded to ensure consistency across nodes (some stations were newer than others).

  • Correlation Analysis: Performed both between-station and within-station correlation analysis to better understand the spatial and temporal dependencies among features.

  • Feature Scaling: Remaining features were scaled using appropriate normalization techniques (e.g., min-max, robust scaling) to facilitate model training.

  • Data Reshaping: The final dataset was transformed into a 3D array of shape (time, station, feature), serving as the unified input format for both the linear and GNN models.
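The temporal compression and reshaping steps can be sketched with numpy (synthetic values stand in for the real station data; the array names are illustrative):

```python
import numpy as np

# Synthetic stand-in for hourly readings: (hours, stations, features).
rng = np.random.default_rng(0)
hourly = rng.normal(size=(96, 7, 5))

# Temporal compression: average each non-overlapping 6-hour block.
n_blocks = hourly.shape[0] // 6
compressed = hourly[: n_blocks * 6].reshape(n_blocks, 6, 7, 5).mean(axis=1)

# Final shape is (time, station, feature), the unified input format.
print(compressed.shape)  # (16, 7, 5)
```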

2. Linear Model

The linear baseline model was designed as a univariate time-series regression task. It uses the preceding 28 time steps (equivalent to 7 days) to predict the target variable tmpf at the next time step.

Model Input:

At each time step t, the input vector aggregates the five base features across all stations:

\(\widehat{\text{tmpf}}_{t+1}=\mathbf{w}^{\top}\left[\text{features}_t,\ \text{features}_{t-1},\ \ldots,\ \text{features}_{t-27}\right]+b\)

Where:

\(\text{features}_t=\{\text{tmpf},\ \text{relh},\ \text{sknt},\ \text{drct}_{\sin},\ \text{drct}_{\cos}\}\)

  • tmpf: Temperature
  • relh: Relative Humidity
  • sknt: Wind Speed
  • \(\text{drct}_{\sin},\ \text{drct}_{\cos}\): Wind direction encoded as sine and cosine components
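A minimal sketch of the circular encoding (column names follow the dataset’s drct convention):

```python
import numpy as np

# Wind direction in degrees; 359 and 1 are physically adjacent but numerically far.
drct = np.array([0.0, 90.0, 180.0, 270.0, 359.0])
rad = np.deg2rad(drct)
drct_sin, drct_cos = np.sin(rad), np.cos(rad)

# Each direction now maps to a point on the unit circle, so 359 and 1 land
# next to each other instead of at opposite ends of the feature range.
```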

This flattened representation allows a standard linear regression model to be trained on a fixed-length vector for each prediction target, collapsing the graph structure of the weather stations into a single aggregated series.
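A sketch of this setup, using synthetic station-aggregated data and plain least squares in place of a scikit-learn regressor:

```python
import numpy as np

rng = np.random.default_rng(1)
T, F = 200, 5                     # time steps, aggregated features per step
series = rng.normal(size=(T, F))  # synthetic stand-in for the aggregated data
lags = 28                         # 28 six-hour steps = 7 days of history

# Each row of X is the 28 preceding feature vectors, flattened to length 140.
X = np.stack([series[t - lags:t].ravel() for t in range(lags, T)])
y = series[lags:, 0]              # next-step tmpf (feature 0)

# Ordinary least squares with an intercept column.
A = np.hstack([X, np.ones((X.shape[0], 1))])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
pred = A @ coef
print(X.shape)  # (172, 140): 28 lags x 5 features = 140 inputs
```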

3. GNN

The GNN is structured to model the spatiotemporal dynamics of the weather station network. It is implemented in PyTorch, and the architecture is inspired by the Diffusion Convolutional Recurrent Neural Network (DCRNN) (Li et al. 2018).

Graph Structure and Input Format

  • Graph Representation: Each weather station is represented as a node, and spatial relationships are encoded in a dense adjacency matrix (excluding self-loops)

  • Data Format: The input is formatted using the StaticGraphTemporalSignal class from torch-geometric-temporal, which supports temporal sequences on a static graph.

  • Train/Validation/Test Split:

    • Testing: Final 6 months of data (730 time steps at 6-hour resolution)

    • Training: First 80% of the remaining data

    • Validation: Final 20% of the remaining data
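The split arithmetic, using the 4,381 six-hour steps of the final dataset:

```python
# Hold out the final 6 months, then split the rest 80/20 chronologically.
total_steps = 4381                    # 6-hour steps in the final dataset
test_steps = 730                      # ~6 months at 4 steps per day

remaining = total_steps - test_steps  # 3651 steps for train + validation
train_steps = int(remaining * 0.8)    # first 80% of the remainder
val_steps = remaining - train_steps   # final 20% of the remainder

print(train_steps, val_steps, test_steps)  # 2920 731 730
```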

Model Architecture

The model consists of three stacked DCRNN layers followed by a linear projection layer:

DCRNN(140, 64) -> ReLU -> DCRNN(64, 32) -> ReLU -> DCRNN(32, 32) -> ReLU -> Linear(32, 1)

  • DCRNN Layers: Capture both temporal patterns as well as spatial diffusion through the graph structure.

  • ReLU Activations: Introduce nonlinearity after each recurrent layer.

  • Linear Output: Maps the final hidden state to the predicted temperature at the next time step
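The spatial half of each DCRNN layer is a diffusion convolution (Li et al. 2018): node features are propagated along powers of a random-walk transition matrix built from the adjacency matrix. A minimal one-direction sketch with fixed coefficients (the real layer learns them and also diffuses along the reverse walk):

```python
import numpy as np

rng = np.random.default_rng(2)
n_nodes, n_feats = 7, 5
W = rng.random((n_nodes, n_nodes))       # dense edge weights
np.fill_diagonal(W, 0.0)                 # no self-loops
X = rng.normal(size=(n_nodes, n_feats))  # node features at one time step

# Random-walk transition matrix: each row sums to 1.
P = W / W.sum(axis=1, keepdims=True)

# K-hop diffusion: weighted sum of P^k @ X (the weights are learned in DCRNN).
K, theta = 2, [0.6, 0.4]
out = sum(theta[k] * np.linalg.matrix_power(P, k + 1) @ X for k in range(K))
print(out.shape)  # (7, 5)
```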

Training Configuration:

  • Optimizer: Adam

  • Learning Rate: Initial rate of 0.01, reduced by a factor of 0.1 upon plateau, with a minimum of \(10^{-5}\)

  • Epochs: Trained for up to 100 epochs with early stopping based on validation loss plateau
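The optimizer and schedule described above can be sketched in PyTorch (a stand-in parameter replaces the actual model, and the patience value here is an assumption, as it is not stated):

```python
import torch

# Stand-in parameter so the optimizer has something to track.
param = torch.nn.Parameter(torch.zeros(1))
optimizer = torch.optim.Adam([param], lr=0.01)

# Reduce the LR by 10x when the validation loss plateaus, with a floor of 1e-5.
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="min", factor=0.1, patience=5, min_lr=1e-5
)

# With a flat validation loss, the LR drops once the patience runs out.
for _ in range(10):
    scheduler.step(1.0)
print(optimizer.param_groups[0]["lr"])
```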

The GNN, like the linear model, uses the preceding 28 time steps to forecast the next temperature value. However, it is equipped to model both temporal trends and spatial interdependencies, which the linear model is not explicitly designed to capture.

Analysis and Results

Data Exploration and Visualization

1. Data Source:

The dataset used in this project was sourced from the Iowa Environmental Mesonet (IEM) hosted by Iowa State University (Herzmann 2023). The data follows observational standards set by the FAA Automated Surface Observing System (ASOS) (Federal Aviation Administration 2021). For this project, we selected a subset covering a 10-year period from 2010 to 2020, focusing on nine weather stations located in southwestern Kansas. These stations were chosen based on geographic proximity and consistency of data availability.

2. Data Structure

The original dataset contains:

  • 33 features

  • 8 stations

  • 96,408 hourly time steps

  • Intermittent missing values across both time and stations

A subset of key features relevant to this project is summarized below:

While additional features are present in the dataset, many were excluded from analysis due to limited relevance or poor data quality.
Feature    Description
station    Station identifier code (3-4 characters)
valid      Timestamp of the observation
lon        Longitude
lat        Latitude
elevation  Elevation (feet)
tmpf       Air temperature (°F)
relh       Relative humidity (%)
drct       Wind direction (degrees)
sknt       Wind speed (knots)
p01i       Precipitation (inches) over the previous hour
vsby       Visibility (miles)
shape: (5, 32)

A random sample of five raw rows spans all 32 columns (station, valid, lat, lon, elevation, tmpf, dwpf, relh, drct, sknt, p01i, alti, mslp, vsby, gust, the skyc1-skyc4 and skyl1-skyl4 sky fields, wxcodes, the ice_accretion_1hr/3hr/6hr fields, peak_wind_gust, peak_wind_drct, peak_wind_time, feel, and snowdepth). In these sampled rows, most auxiliary columns are null, illustrating the intermittent missing values noted above.

3. Exploratory Data Analysis (EDA)

Initial exploratory analysis focused on filtering out low-quality data and reducing the dimensionality of the dataset:

  • Features with more than 10% missing values were removed.

  • Stations with excessive missing data during the 2010-2020 window were dropped. One station was excluded entirely as it was introduced after 2020, and thus had no data within the selected range.

  • The remaining dataset was evaluated to ensure sufficient temporal coverage and consistency across stations and features.

Based on this visualization, a date range of 2018 to 2020 was selected, as it had the most valid features and stations while also being relatively recent. With this range selected, the ULS station was dropped, resulting in seven valid stations.

shape: (30_667, 12)
┌─────────┬─────────────────────┬───────────┬─────────────┬─────────────┬───────────┬───────────┬───────────┬───────────┬───────────┬───────────┬───────────┐
│ station ┆ valid               ┆ lat       ┆ lon         ┆ elevation   ┆ tmpf      ┆ dwpf      ┆ relh      ┆ sknt      ┆ feel      ┆ drct_sin  ┆ drct_cos  │
│ ---     ┆ ---                 ┆ ---       ┆ ---         ┆ ---         ┆ ---       ┆ ---       ┆ ---       ┆ ---       ┆ ---       ┆ ---       ┆ ---       │
│ cat     ┆ datetime[μs]        ┆ f32       ┆ f32         ┆ f32         ┆ f32       ┆ f32       ┆ f32       ┆ f32       ┆ f32       ┆ f64       ┆ f64       │
╞═════════╪═════════════════════╪═══════════╪═════════════╪═════════════╪═══════════╪═══════════╪═══════════╪═══════════╪═══════════╪═══════════╪═══════════╡
│ GCK     ┆ 2018-01-01 00:00:00 ┆ 37.927502 ┆ -100.724403 ┆ 881.0       ┆ 9.283334  ┆ -6.5      ┆ 48.148335 ┆ 8.666667  ┆ -4.161667 ┆ 0.851117  ┆ 0.524977  │
│ LBL     ┆ 2018-01-01 00:00:00 ┆ 37.044201 ┆ -100.9599   ┆ 879.0       ┆ 12.316667 ┆ -1.983333 ┆ 52.014999 ┆ 10.166667 ┆ -1.721667 ┆ 0.664796  ┆ 0.747025  │
│ EHA     ┆ 2018-01-01 00:00:00 ┆ 37.000801 ┆ -101.879997 ┆ 1099.0      ┆ 15.555555 ┆ 5.255556  ┆ 63.382778 ┆ 7.388889  ┆ 4.482222  ┆ 0.970763  ┆ 0.24004   │
│ HQG     ┆ 2018-01-01 00:00:00 ┆ 37.163101 ┆ -101.370499 ┆ 956.52002   ┆ 14.311111 ┆ -1.605556 ┆ 48.468334 ┆ 7.777778  ┆ 2.681667  ┆ 0.936332  ┆ 0.351115  │
│ 3K3     ┆ 2018-01-01 00:00:00 ┆ 37.991699 ┆ -101.7463   ┆ 1005.700012 ┆ 13.1      ┆ -0.9      ┆ 53.127777 ┆ 6.777778  ┆ 2.151111  ┆ 0.981255  ┆ 0.192712  │
│ …       ┆ …                   ┆ …         ┆ …           ┆ …           ┆ …         ┆ …         ┆ …         ┆ …         ┆ …         ┆ …         ┆ …         │
│ EHA     ┆ 2020-12-31 00:00:00 ┆ 37.000801 ┆ -101.879997 ┆ 1099.0      ┆ 42.355556 ┆ 16.588888 ┆ 34.955555 ┆ 3.0       ┆ 40.254444 ┆ -0.533205 ┆ -0.845986 │
│ HQG     ┆ 2020-12-31 00:00:00 ┆ 37.163101 ┆ -101.370499 ┆ 956.52002   ┆ 40.722221 ┆ 13.255555 ┆ 32.23111  ┆ 1.666667  ┆ 39.450001 ┆ 0.977334  ┆ -0.211704 │
│ 3K3     ┆ 2020-12-31 00:00:00 ┆ 37.991699 ┆ -101.7463   ┆ 1005.700012 ┆ 40.200001 ┆ 12.2      ┆ 31.377777 ┆ 4.555555  ┆ 36.683334 ┆ -0.700217 ┆ -0.71393  │
│ JHN     ┆ 2020-12-31 00:00:00 ┆ 37.578201 ┆ -101.7304   ┆ 1012.710022 ┆ 40.711113 ┆ 17.4      ┆ 38.554443 ┆ 4.777778  ┆ 37.06889  ┆ -0.824675 ┆ -0.565607 │
│ 19S     ┆ 2020-12-31 00:00:00 ┆ 37.496899 ┆ -100.832901 ┆ 892.570007  ┆ 39.922222 ┆ 15.388889 ┆ 36.552223 ┆ 6.111111  ┆ 35.113335 ┆ -0.824675 ┆ 0.565607  │
└─────────┴─────────────────────┴───────────┴─────────────┴─────────────┴───────────┴───────────┴───────────┴───────────┴───────────┴───────────┴───────────┘

4. Graph Creation

To prepare the dataset for graph-based modeling, a spatial graph was constructed:

  • Each station was treated as a node.

  • A dense adjacency matrix (excluding self-connections) was created by computing geodesic distances between stations.

  • Edge weights were defined as the inverse of the geodesic distance, scaled to a [0, 1] range using a min-max scaler. The closer two stations are, the stronger their connection in the graph.
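A sketch of the construction, using station coordinates from the sample rows above and the haversine great-circle distance as a stand-in for a true geodesic:

```python
import numpy as np

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km (a stand-in for a true geodesic)."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * np.arcsin(np.sqrt(a))

# (lat, lon) for the seven stations, taken from the sample rows above.
coords = [
    (37.9275, -100.7244),  # GCK
    (37.0442, -100.9599),  # LBL
    (37.0008, -101.8800),  # EHA
    (37.1631, -101.3705),  # HQG
    (37.9917, -101.7463),  # 3K3
    (37.5782, -101.7304),  # JHN
    (37.4969, -100.8329),  # 19S
]
n = len(coords)
dist = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        if i != j:
            dist[i, j] = haversine_km(*coords[i], *coords[j])

# Inverse distance, min-max scaled over off-diagonal entries; diagonal stays 0.
mask = ~np.eye(n, dtype=bool)
inv = np.zeros_like(dist)
inv[mask] = 1.0 / dist[mask]
adj = np.zeros_like(dist)
adj[mask] = (inv[mask] - inv[mask].min()) / (inv[mask].max() - inv[mask].min())
```

Note that min-max scaling maps the weakest connection to exactly zero, effectively removing the edge between the two most distant stations.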

5. Spatiotemporal Imputation

Missing values were imputed through a two-stage process leveraging both spatial and temporal structure:

  1. Spatial Imputation: Each missing value was estimated based on the value of neighboring nodes within the same time step, weighted by graph connectivity.

  2. Temporal Imputation: Remaining gaps were filled by interpolating along the time axis for each node individually.

While not a perfect method, this approach produced plausible and continuous data, as visually confirmed during quality checks and shown below.
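The two-stage fill can be sketched on synthetic data (toy adjacency weights here; the real pipeline uses the graph built in the previous step):

```python
import numpy as np

rng = np.random.default_rng(3)
T, N = 50, 4
data = rng.normal(size=(T, N))  # (time, station) for one feature
adj = rng.random((N, N))
np.fill_diagonal(adj, 0.0)      # toy edge weights, no self-loops

# Knock out one value, plus a time step missing at every station.
data[5, 1] = np.nan
data[20, :] = np.nan

# Stage 1: spatial imputation -- connectivity-weighted mean of observed neighbors.
filled = data.copy()
for t in range(T):
    miss = np.isnan(filled[t])
    if miss.any() and not miss.all():
        obs = ~miss
        for i in np.where(miss)[0]:
            w = adj[i, obs]
            filled[t, i] = (w @ filled[t, obs]) / w.sum()

# Stage 2: temporal interpolation for gaps no neighbor could fill.
for i in range(N):
    col = filled[:, i]
    miss = np.isnan(col)
    col[miss] = np.interp(np.where(miss)[0], np.where(~miss)[0], col[~miss])

# Every gap is now filled: (5, 1) spatially, the all-missing row at t=20 temporally.
```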

6. Correlation Analysis

To avoid feature redundancy and data leakage, a correlation analysis was conducted:

  • Inter-node and Intra-node correlations were computed.

  • Two features, dwpf (dew point, °F) and feel (feels-like temperature, °F), showed high correlation with the target variable tmpf (temperature, °F). Since both are partially derived from tmpf, they were removed to maintain model integrity and avoid leakage.

7. Final Preparation

After all preprocessing steps, the final dataset was reduced and standardized:

  • 5 features were retained: tmpf, relh, sknt, drct_sin, drct_cos

  • 7 stations remained after filtering

  • 4,381 time steps at 6-hour intervals (equivalent to three years of data)

The remaining unscaled features were scaled using the RobustScaler from scikit-learn to mitigate the influence of outliers while preserving the overall data distribution.
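The effect of robust scaling can be sketched in numpy (median/IQR centering, matching RobustScaler's defaults):

```python
import numpy as np

rng = np.random.default_rng(4)
x = np.concatenate([rng.normal(50.0, 5.0, size=500), [500.0, -300.0]])  # two outliers

# RobustScaler default behaviour: center on the median, scale by the IQR.
median = np.median(x)
q1, q3 = np.percentile(x, [25, 75])
scaled = (x - median) / (q3 - q1)

# The outliers barely move the median and IQR, so the bulk of the data keeps
# a sensible scale, unlike z-scoring with an outlier-inflated std.
```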

shape: (30_667, 6)
┌─────────┬───────────┬──────────┬───────────┬───────────┬───────────┐
│ station ┆ tmpf      ┆ relh     ┆ sknt      ┆ drct_sin  ┆ drct_cos  │
│ ---     ┆ ---       ┆ ---      ┆ ---       ┆ ---       ┆ ---       │
│ cat     ┆ f64       ┆ f64      ┆ f64       ┆ f64       ┆ f64       │
╞═════════╪═══════════╪══════════╪═══════════╪═══════════╪═══════════╡
│ GCK     ┆ -1.355721 ┆ 0.481483 ┆ -0.080357 ┆ 0.851117  ┆ 0.524977  │
│ LBL     ┆ -1.265174 ┆ 0.52015  ┆ 0.160714  ┆ 0.664796  ┆ 0.747025  │
│ EHA     ┆ -1.168491 ┆ 0.633828 ┆ -0.285714 ┆ 0.970763  ┆ 0.24004   │
│ HQG     ┆ -1.205638 ┆ 0.484683 ┆ -0.223214 ┆ 0.936332  ┆ 0.351115  │
│ 3K3     ┆ -1.241791 ┆ 0.531278 ┆ -0.383929 ┆ 0.981255  ┆ 0.192712  │
│ …       ┆ …         ┆ …        ┆ …         ┆ …         ┆ …         │
│ EHA     ┆ -0.368491 ┆ 0.349556 ┆ -0.991072 ┆ -0.533205 ┆ -0.845986 │
│ HQG     ┆ -0.417247 ┆ 0.322311 ┆ -1.205357 ┆ 0.977334  ┆ -0.211704 │
│ 3K3     ┆ -0.432836 ┆ 0.313778 ┆ -0.741072 ┆ -0.700217 ┆ -0.71393  │
│ JHN     ┆ -0.417579 ┆ 0.385544 ┆ -0.705357 ┆ -0.824675 ┆ -0.565607 │
│ 19S     ┆ -0.441128 ┆ 0.365522 ┆ -0.491072 ┆ -0.824675 ┆ 0.565607  │
└─────────┴───────────┴──────────┴───────────┴───────────┴───────────┘

Modeling and Results

Model 1: Graph Neural Network (DCRNN)

The Graph Neural Network was trained using the previous 28 time steps (equivalent to 7 days) and leveraged a dense spatial graph connecting all stations. This structure enabled the model to learn both temporal sequences and spatial diffusion patterns across the weather station network.

  • Graph Structure: Dense graph with edge weights based on inverse geodesic distance

  • Architecture: Three stacked DCRNN layers + ReLU activations + Linear projection

  • Target: Next-step temperature prediction for each node

Results:

MSE: 0.0562

It is visually apparent that the model is able to follow the temporal trends of the weather data, though with some latency in its predictions.

There is also a noticeable latency in the predictions, and a maximum absolute error of 1 is quite extreme given that the scaled test features range from roughly -1.5 to 1.5.

Model 2: Linear Regression

The linear baseline model was trained using the same 28-time-step history with aggregated weather station data to predict the next time step’s temperature. All features were flattened into a single vector, treating the problem as a high-dimensional regression task with no spatial awareness.

  • Strengths: Simplicity, interpretability, low training time

  • Weakness: Cannot leverage inter-station relationships or dynamic spatial trends.

MSE: 0.0147

It is visually apparent that the linear model performs considerably better than the GNN, with little latency in the predicted results.

It is also clear that the maximum absolute error is 0.8, which is considerably less than the 1 observed for the GNN model.

When both models are compared against each other per station, it becomes apparent how poorly the GNN performs on these predictive tasks.

Key Findings and Interpretation

  1. Small Graph Structures Do Not Matter: The GNN severely underperformed the linear model in both MSE and absolute error across all stations, highlighting the strength of a standard linear model on a small-scale weather system.

  2. Temporal Context is Crucial: Both models benefitted from the use of 28 historical time steps, suggesting that short-term temporal trends are strong predictors of near-future temperature.

  3. Feature Engineering Adds Value: Replacing wind direction with sine and cosine components improved learning stability and reduced directional ambiguity.

  4. Graph Structure is Important: The small scale of the graph most likely played a significant role in the underperformance of the GNN model.

  5. Static Graphs are Restrictive: Not applying an encoding and decoding layer to the GNN most likely limited information transfer between nodes; however, this may not have mattered for such a small system.

Conclusion

Summary of Key Results

  • A Graph Neural Network trained on a spatiotemporal weather graph may not outperform a standard linear regression baseline for short-term temperature prediction.

  • Incorporating spatial structure through graph edges allowed the GNN to represent regional weather interactions that linear models cannot, though this did not translate into better accuracy here.

  • Careful data preprocessing, including imputation, scaling, and circular feature handling, was essential to achieving strong performance from both models.

  • GNNs are highly sensitive to parameter tuning and may outperform the linear baseline if provided a much larger model structure.

Implications and Future Work

These findings demonstrate the potential of traditional models compared to graph-based deep learning approaches. However, it is also apparent that aggregating stations, as the linear model does, most likely only worked because of the stations' close proximity. If the dataset instead consisted of stations from across the US, we doubt it would be possible to aggregate in a way that preserves spatial information. For this reason, we still believe there is potential in graph-based deep learning approaches for large-scale weather forecasting.

Future improvements could include:

  • Proper multivariate analysis

  • Large geographic area

  • Application of an encoding-decoding step

  • More advanced GNN variants

  • Exploring other imputation techniques

References

Federal Aviation Administration. 2021. “Automated Surface Observing System.” https://www.weather.gov/media/asos/aum-toc.pdf.
Herzmann, Daryl. 2023. “IEM:: Download ASOS/AWOS/Metar Data.” Iowa Environmental Mesonet. https://www.mesonet.agron.iastate.edu/request/download.phtml?network=KS_ASOS.
Keisler, Ryan. 2022. “Forecasting Global Weather with Graph Neural Networks.” https://arxiv.org/abs/2202.07575.
Lam, Remi, Alvaro Sanchez-Gonzalez, Matthew Willson, Peter Wirnsberger, Meire Fortunato, Ferran Alet, Suman Ravuri, et al. 2023. “GraphCast: Learning Skillful Medium-Range Global Weather Forecasting.” https://arxiv.org/abs/2212.12794.
Li, Yaguang, Rose Yu, Cyrus Shahabi, and Yan Liu. 2018. “Diffusion Convolutional Recurrent Neural Network: Data-Driven Traffic Forecasting.” https://arxiv.org/abs/1707.01926.